Estimation of GMM in voice conver

Author

  • Helenca Duxans
Abstract

Voice conversion consists of transforming a source speaker's voice into a target speaker's voice. In many applications of voice conversion systems, the amounts of training data available from the source speaker and from the target speaker differ. Usually, a large amount of source data is available, but the transformation must be estimated from only a small amount of target data. Systems based on joint Gaussian Mixture Models (GMMs) are well suited to voice conversion [1], but they cannot use source data that lacks corresponding aligned target data. In this paper, two alternatives are studied for incorporating unaligned source data in the estimation of a GMM for a voice conversion task. It is shown that when only a limited amount of aligned parameters is available in the training step, including additional data from the source speaker alone increases the performance of the voice transformation.
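As an illustration of the joint-GMM approach the abstract refers to, the following minimal sketch shows the regression form commonly used with a joint GMM: the converted feature is a posterior-weighted sum of per-component linear regressions. It is not taken from the paper and does not reproduce either of the two alternatives studied there. The scalar (one-dimensional) features, the dictionary keys, the function names, and the toy parameter values are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert(x, components):
    """Map a scalar source feature x to the target space.

    Each component is a dict with hypothetical keys:
    weight, mu_x, mu_y, var_x (source variance), cov_yx (cross-covariance).
    """
    # Posterior p(k | x) uses only the source-side marginal of each component.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]
    # Posterior-weighted sum of per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )


# Toy two-component model (made-up values, not estimated from any data).
components = [
    {"weight": 0.5, "mu_x": -2.0, "mu_y": -1.0, "var_x": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 2.0, "mu_y": 3.0, "var_x": 1.0, "cov_yx": 0.5},
]
converted = convert(0.0, components)
```

In this form the posterior weights depend only on the source-side parameters (weights, source means, source variances), while the target means and cross-covariances can only be estimated from aligned source-target pairs.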


Similar resources

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline to be improved. In current methods, because a single conversion function is used to convert all speech units, and because of the subsequent spectral smoothing arising from statistical averaging, we will observe quality ...


GMM Classifier for Identification of Neurological Disordered Voices Using MFCC Features

Automatic detection of the voices of neurologically disordered subjects mostly relies on parameters extracted from time-domain processing. The calculation of these parameters often requires prior pitch period estimation, which in turn depends heavily on the robustness of the pitch detection algorithm. In the present work, a cepstral-domain processing technique which does not require pitch estimation has been ado...


Voice activity detection using global soft decision with mixture of Gaussian model

This paper presents an improvement to the voice detection algorithm using global soft decision (GSD). In the GSD method, the speech and noise are modelled by a presumed probability density function, e.g. a Gaussian pdf. We propose that the estimation and modelling of the signal be done in the domain of the filterbank output, which is widely used in most speech processing applications. Since the output of...


Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Gaussian Mixture Models (GMMs) of power spectral densities of speech and noise are used with explicit Bayesian estimations in Wiener filtering of noisy speech. No assumption is made on the nature or stationarity of the noise. No voice activity detection (VAD) or any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined system of equatio...


Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation

The performance of voice conversion has been considerably improved through statistical modeling of spectral sequences. However, the converted speech still contains traces of artificial sounds. To alleviate this, it is necessary to statistically model a source sequence as well as a spectral sequence. In this paper, we introduce STRAIGHT mixed excitation to a framework of voice conversion bas...



Publication date: 2003